RDFAdaptor: Efficient ETL Plugins for RDF Data Process
Authors
Abstract
Purpose: The interdisciplinary nature and rapid development of the Semantic Web led to the mass publication of RDF data in a large number of widely accepted serialization formats, creating the need for RDF data processing with specific purposes. The paper reports on an assessment of the chief RDF data endpoint challenges and introduces RDFAdaptor, a set of plugins for RDF data processing which covers the whole life-cycle with high efficiency.
Design/methodology/approach: RDFAdaptor is designed on the basis of the prominent ETL tool Pentaho Data Integration, which provides a user-friendly and intuitive interface and allows connections to various data sources and formats. It reuses the Java framework RDF4J as middleware that realizes access to data repositories, SPARQL endpoints, and all leading RDF database solutions with SPARQL 1.1 support. It can support effortless services with various configuration templates in multi-scenario applications, and can help extend data process tasks in other services or tools to complement missing functions.
Findings: The proposed comprehensive RDF ETL solution, RDFAdaptor, provides an easy-to-use and intuitive interface, supports data integration and federation over multi-source heterogeneous repositories or endpoints, and manages linked data in hybrid storage mode.
Research limitations: The plugin set can support several application scenarios of RDF data processing, but error detection/check and interaction with other graph repositories remain to be improved.
Practical implications: The plugin set can provide a user interface and configuration templates which enable its usability in various applications of RDF data generation, multi-format data conversion, remote RDF data migration, and RDF graph update in the semantic query process.
Originality/value: This is the first attempt to develop components, instead of systems, that can extract, consolidate, and store RDF data on the basis of an ecologically mature data warehousing environment.
Similar resources
UnifiedViews: An ETL Tool for RDF Data Management
We present UnifiedViews, an Extract-Transform-Load (ETL) framework that allows users to define, execute, monitor, debug, schedule, and share data processing tasks, which may employ custom plugins (data processing units) created by users. UnifiedViews natively supports processing of RDF data. In this paper, we: (1) introduce UnifiedViews' basic concepts and features, (2) demonstrate the maturity ...
UnifiedViews: An ETL Framework for Sustainable RDF Data Processing
We present UnifiedViews, an Extract-Transform-Load (ETL) framework that allows users to define, execute, monitor, debug, schedule, and share ETL data processing tasks, which may employ custom plugins created by users. UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies. We are persuaded that UnifiedViews helps RDF/Linked Data consumers to address the pr...
Efficient RDF Interchange (ERI) Format for RDF Data Streams
RDF streams are sequences of timestamped RDF statements or graphs, which can be generated by several types of data sources (sensors, social networks, etc.). They may provide data at high volumes and rates, and be consumed by applications that require real-time responses. Hence it is important to publish and interchange them efficiently. In this paper, we exploit a key feature of RDF data stream...
Efficient Parallel Dictionary Encoding for RDF Data
The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for the purposes of Semantic Web applications that perform computations over large volumes of information. A typical method for alleviating the impact of this problem is through the use of compression methods t...
Query Optimizer for the ETL Process in Data Warehouses
The ETL (Extraction-Transformation-Loading) process is responsible for extracting data from several sources, then cleansing, transforming, integrating, and loading it into a data warehouse. The extraction process accesses large amounts of data by executing several complex queries in source databases. These queries are repetitive and are executed at regular intervals to refresh the data warehouse. Extraction of data f...
Journal
Journal title: Journal of Data and Information Science
Year: 2021
ISSN: 2096-157X, 2543-683X
DOI: https://doi.org/10.2478/jdis-2021-0020